Contamination cannot explain the lack of large-scale power in 
the cosmic microwave background radiation 
Several anomalies appear to be present in the large-angle cosmic microwave background (CMB) 
anisotropy maps of WMAP. One of these is a lack of large-scale power. Because the data otherwise 
match standard models extremely well, it is natural to consider perturbations of the standard model 
as possible explanations. We show that, as long as the source of the perturbation is statistically 
independent of the source of the primary CMB anisotropy, no such model can explain this large-scale 
power deficit. On the contrary, any such perturbation always reduces the probability of obtaining 
any given low value of large-scale power. We rigorously prove this result when the lack of large-scale 
power is quantified with a quadratic statistic, such as the quadrupole moment. When a statistic 
based on the integrated square of the correlation function is used instead, we present strong numerical 
evidence in support of the result. The result applies to models in which the geometry of spacetime 
is perturbed (e.g., an ellipsoidal Universe) as well as explanations involving local contaminants, 
undiagnosed foregrounds, or systematic errors. Because the large-scale power deficit is arguably 
the most significant of the observed anomalies, explanations that worsen this discrepancy should be 
regarded with great skepticism, even if they help in explaining other anomalies such as multipole 
alignments. 
I. INTRODUCTION 

Observations of cosmic microwave background (CMB) 
anisotropy, particularly the data from WMAP [1, 2, 3], 
have revolutionized cosmology. These observations are 
a major contributor to the emergence of a cosmological 
"standard model" of a Universe dominated by dark en- 
ergy and cold dark matter, with a nearly scale-invariant 
spectrum of Gaussian adiabatic perturbations [4, 5]. The 
overall consistency of the CMB data with this model is 
quite remarkable, but there appear to be some anomalies 
on the largest angular scales, such as a lack of large-scale 
power, alignment of low-order multipoles, and 
hemispheric asymmetries. 
The significance of and explanations for these puzzles 
are hotly debated. In particular, it is difficult to know 
how to interpret a posteriori statistical significances: 
when a statistic is invented to quantify an anomaly that 
has already been noticed, the low p-values for that statistic 
cannot be taken at face value. Nonetheless, the number 
and nature of the anomalies (in particular, the fact 
that several seem to pick out the same directions on the 
sky) seem to suggest that there may be something to 
explain in the data. In this paper, we will tentatively as- 
sume that there is a need for an explanation and consider 
what that explanation might be. 

Since the standard model is in general highly consistent 
with the CMB and a wide variety of other observations, 
it is natural to look for explanations of these puzzles that 
consist of perturbations added onto the standard model. 
Such explanations can be based on nonstandard cosmologies, 
such as ellipsoidal models [14], large-scale magnetic 
fields [15], and theories based on Bianchi VII$_h$ spacetimes 
with rotation and shear [16, 17]. They can also involve 
phenomena on much smaller scales (e.g., [18, 19, 20, 21]), 
perhaps even within the Solar System [22]. Any undiagnosed 
foreground contaminant would fall into the class 
of explanations we consider, as would many systematic 
errors. 

All of these models can be described by assuming that 
the observed CMB sky is the sum of two terms: 

$$T_{\rm obs}(\mathbf{r}) = T_0(\mathbf{r}) + T_c(\mathbf{r}), \tag{1}$$

where $T_0$ is a Gaussian CMB sky with a power spectrum 
given by the standard model and $T_c$ is a contaminant. 
The contaminant can be a fixed function of sky position $\mathbf{r}$ 
or a realization of a random process. In the latter case, we 
assume nothing about the statistics of this process except 
that it is independent of the Gaussian random process 
that produced $T_0$. We wish to consider the possibility 
that such a model can explain some or all of the large-angle 
anomalies. 

*Electronic address: ebunn@richmond.edu 

In this article, we will present strong evidence that 
on the contrary all such models actually exacerbate one 
of the anomalies, namely the observed lack of power in 
the large-angular-scale CMB anisotropy. This anomaly 
is formally highly statistically significant, and as we will 
argue below it is one for which the problems of a poste- 
riori statistics are not particularly severe. It is therefore 
arguably the most in need of explanation of all of the 
large-angle CMB puzzles. We conclude, therefore, that 
this entire category of possible explanations should be 
regarded with great skepticism. In particular, the absence 
of large-scale power in the WMAP data is in fact 
a strong argument against the existence of undiagnosed 
foreground contamination, as well as systematic errors 
that would produce an additive contaminant to the observed 
sky maps. 

FIG. 1: The two-point correlation function for the WMAP 
data. The solid curve shows the correlation function for the 
three-year WMAP internal linear combination (ILC) data at 
resolution $N_{\rm side} = 32$, computed with the SpICE software 
[24]. The dashed curves show the 95% confidence range in a 
set of simulations. The simulations virtually never produce 
correlation functions that are as close to zero at large angles 
as the real data. 

We can quantify the lack of power in the large-angle 
CMB by considering either the power in low-order multi- 
poles (especially the quadrupole) or a statistic based on 
the two-point angular correlation function (see Fig. 1). 
In either case, the p-values (that is, the probabilities of 
getting as low a value of the chosen statistic as the one 
in the actual data) are low; in fact, for some choices of 
statistic, they are less than 0.1% (but see [23] for a 
contrasting analysis). By definition, for an alternative 
theory to explain this anomaly, it would have to gener- 
ate larger p-values. We will show in this paper that all 
proposed models of the form described above in fact re- 
duce the p-values based on these statistics. Therefore, 
although such models might alleviate some of the other 
large-angle anomalies, they worsen this one. 

At one level, this is not surprising. For the models con- 
sidered here, in which the observations are the sum of two 
statistically independent terms, the observed power spec- 
trum is simply the sum of the standard-model spectrum 
and the spectrum of the contaminant. Addition of the 
contaminant therefore biases all multipoles up, includ- 
ing the quadrupole. This is merely a statement about 
mean-square values, however, and does not tell us about 
the probability distribution of the multipoles. It is log- 
ically possible that a (non-Gaussian) contaminant, even 
as it biases the mean-square quadrupole up, widens the 
probability distribution for the quadrupole in such a way 
as to enhance the probability of getting low values. Indeed, 
any proposal to explain the lack of large-scale power 
through a perturbation to the standard model must be 
proposing such an effect, since this is what it would mean 
to "explain" the discrepancy. 

For example, suggestions have been made that the low 
quadrupole might be explained by an extended local foreground 
[20], by dust-filled local voids$^1$ [18, 19], or by an 
"ellipsoidal" universe that expands at different rates in 
different directions. Each such explanation assumes 
that a chance anticorrelation between the contaminant 
and the intrinsic CMB anisotropy has occurred. In order 
for this to count as an explanation, however, such an an- 
ticorrelation must be sufficiently probable that it raises 
the probability of finding the observed lack of power. Al- 
though this is a logical possibility, we will argue below 
that it in fact never occurs, whether the lack of power is 
quantified via the quadrupole moment or the correlation 
function. For some specific cases, such as the quadrupole 
moment in an ellipsoidal universe, previous work [25] has 
already established this; in this paper we prove it in gen- 
eral. In summary, such models cannot explain the lack 
of large-scale power, and in fact always "anti-explain" it 
by reducing the already-low probability. 

Section II proves this general result in the case where 
the lack of power is quantified via a quadratic estimator 
such as the mean-square quadrupole moment. Section III 
presents strong numerical evidence that the result is also 
true in the case of a statistic based on the two-point correlation 
function. Section IV contains a brief discussion 
of the results, and an appendix proves a key mathematical 
result needed in Section II. 



II. QUADRATIC POWER ESTIMATORS 

As noted above, the observed lack of large-scale power 
in the CMB can be quantified in different ways. The simplest, 
going all the way back to the COBE observations 
[26, 27], is to compute an estimator of the quadrupole 
power $C_2 = \langle |a_{2m}|^2 \rangle$, where $a_{lm}$ is a coefficient in a 
spherical harmonic expansion. Quadrupole estimators 
applied to the WMAP data are lower than theoretical 
predictions, although due to the large cosmic variance, 
the significance of this anomaly is only $\sim 5\%$ [8], which 
is weaker than that of the correlation function statistic described 
in the next section. Nonetheless, because the quadrupole 
is one of the simplest and most natural ways to quantify 
large-scale power, we consider it in detail in this section. 
In particular, we will demonstrate that any statistically 
independent contaminant exacerbates the problem of an 
anomalously low quadrupole. 



$^1$ A suggestion is made in the cited work that the hypothesis that 
the contaminant is uncorrelated with the primary signal may 
not apply. If this is true, then the arguments in the present 
paper would not apply to this model. It is not clear to us that a 
strong correlation of the proposed form exists in the model under 
consideration, and as far as we know no detailed calculation of 
this effect has been performed. 
FIG. 2: Cumulative probability distributions for the quadrupole power $C_2$ in an ellipsoidal Universe, in dimensionless $\Delta T/T$ 
units. In the left panel, data from the entire sky were used, and in the right panel the WMAP Kp0 cut was applied. The 
solid curve shows a standard LCDM model with no eccentricity. From bottom to top, the other three curves correspond to 
eccentricities $5 \times 10^{-3}$, $6.2 \times 10^{-3}$, $7.4 \times 10^{-3}$. The horizontal line shows the quadrupole found in the actual data. 



The quadrupole power is a positive definite quadratic 
function $q^2$ of the data. As noted in the previous section, 
a contaminant always causes an upward bias in the 
expectation value of such a statistic; to be precise, the 
expectation value is $\langle q^2 \rangle = \langle q_0^2 \rangle + \langle q_c^2 \rangle$, where the two 
terms on the right are the expectation values due to the 
two contributors $T_0$, $T_c$. As also noted there, 
however, this statement is not sufficient to justify the 
claim that adding a contaminant always exacerbates the 
problem of a low quadrupole. We need to show that the 
probability of getting a low quadrupole is always reduced 
by adding a contaminant; that is, for any given value 
$\hat{q}^2$, the probability that the observed value is less than $\hat{q}^2$ 
is always lower with a contaminant than without. 

Let the vector $\mathbf{y}$ represent a list of data points that we 
will use to estimate the large-angle power in the CMB, for 
example, the pixelized temperature values in the WMAP 
data. Let $q^2$ be a positive definite quadratic function of 
the data (possibly with some noise bias removed): 

$$q^2(\mathbf{y}) = \mathbf{y} \cdot \mathbf{A} \cdot \mathbf{y} - b. \tag{2}$$

Here $\mathbf{A}$ is a symmetric nonnegative definite matrix, and 
the noise bias $b$ is a constant. 

We want to compare the null hypothesis, that $\mathbf{y}$ contains 
only intrinsic CMB anisotropy and noise, with the 
hypothesis that there is an additional statistically independent 
contaminant. We can express these possibilities 
by writing 

$$\mathbf{y} = \mathbf{x} + \mathbf{c}, \tag{3}$$

where $\mathbf{x}$ is the "uncontaminated" data (including noise) 
and $\mathbf{c}$ represents a hypothetical contaminant. We assume 
that $\mathbf{x}$ is drawn from a multivariate Gaussian distribution: 

$$f_x(\mathbf{x}) \propto \exp\left(-\tfrac{1}{2}\, \mathbf{x} \cdot \mathbf{M} \cdot \mathbf{x}\right) \tag{4}$$

for some symmetric positive definite matrix $\mathbf{M}$. For the 
null hypothesis, we set $\mathbf{c} = 0$. When considering contamination, 
we assume $\mathbf{c}$ is a random variable with some probability 
density $f_c$. (This formulation includes the possibility 
that $\mathbf{c}$ is a fixed contaminant; that is, $f_c$ is allowed 
to be a delta function.) No assumption is made about $f_c$ 
other than independence of $\mathbf{x}$ and $\mathbf{c}$, which means that 
the joint probability density factors: 

$$f(\mathbf{x}, \mathbf{c}) = f_x(\mathbf{x})\, f_c(\mathbf{c}). \tag{5}$$

Let $\hat{\mathbf{y}}$ be the data actually measured, and let $\hat{q}^2 = q^2(\hat{\mathbf{y}})$ 
stand for the power estimate obtained from it. Let 
$P_{\mathbf{c}}$ stand for the probability of getting a value of $q^2$ as 
low as the measured value, assuming a fixed value for the 
contaminant $\mathbf{c}$: 

$$P_{\mathbf{c}} = \Pr[q^2(\mathbf{y}) < \hat{q}^2 \mid \mathbf{c}] = \int_{\mathbf{x}+\mathbf{c} \in V} d\mathbf{x}\, f_x(\mathbf{x}), \tag{6}$$

where the volume $V$ is the ellipsoid consisting of all $\mathbf{y}$ 
with $\mathbf{y} \cdot \mathbf{A} \cdot \mathbf{y} < \hat{q}^2 + b$. 

Note that $P_{\mathbf{c}}$ is an integral over an ellipsoid centered 
at $\mathbf{x} = -\mathbf{c}$. Since the integrand peaks at the origin, we 
would expect $P_{\mathbf{c}}$ to be maximized when the ellipsoid's 
center is placed at the origin. To be specific, we expect 
that 

$$P_{\mathbf{c}} \le P_{\mathbf{0}}. \tag{7}$$

This expectation is indeed correct; a proof of it may be 
found in the Appendix. 

FIG. 3: Cumulative probability distributions for the statistic $S_{1/2}$. As in Figure 2, the left panel is for full-sky data, while the 
right panel is for the Kp0 cut. Curves are as in the previous figure. 

This means that adding any fixed contaminant $\mathbf{c}$ always 
reduces the probability of getting a low $q^2$. As a 
consequence, even if $\mathbf{c}$ is not fixed but is generated by 
some random process, the probability is still lower than 
in the case $\mathbf{c} = 0$. Formally, we can write 

$$\Pr[q^2(\mathbf{y}) < \hat{q}^2] = \int d\mathbf{c}\, \Pr[q^2(\mathbf{y}) < \hat{q}^2 \mid \mathbf{c}]\, f_c(\mathbf{c}). \tag{8}$$

Using inequality (7), 

$$\Pr[q^2(\mathbf{y}) < \hat{q}^2] \le P_{\mathbf{0}} \int f_c(\mathbf{c})\, d\mathbf{c} = P_{\mathbf{0}}. \tag{9}$$

This inequality is the central result of this section. It 
means that, if the power is anomalously low under the 
assumption of no contamination, then introducing a con- 
taminant can only make the problem worse. 
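
As a numerical illustration of this inequality, the following Python sketch estimates the probability of a low quadratic statistic with and without a contaminant. It is not the calculation used in this paper: the dimension, the identity matrices standing in for $\mathbf{A}$ and the covariance of $\mathbf{x}$, the threshold, the two example contaminants, and the function name `prob_low_power` are all assumptions chosen for illustration.

```python
import numpy as np

def prob_low_power(A, cov, c, q2_hat, b=0.0, n_samples=100_000, seed=0):
    """Monte Carlo estimate of Pr[q^2(y) < q2_hat] for y = x + c.

    x is Gaussian with covariance `cov` (the inverse of M in Eq. (4)).
    c is either one fixed vector of shape (dim,) or one independent draw
    per sample, shape (n_samples, dim).
    """
    rng = np.random.default_rng(seed)
    dim = cov.shape[0]
    x = rng.multivariate_normal(np.zeros(dim), cov, size=n_samples)
    y = x + c
    q2 = np.einsum("ni,ij,nj->n", y, A, y) - b  # quadratic statistic, Eq. (2)
    return float(np.mean(q2 < q2_hat))

# Toy setup (all values are illustrative assumptions).
dim = 5
A = np.eye(dim)
cov = np.eye(dim)
q2_hat = 2.0

# No contaminant.
p_none = prob_low_power(A, cov, np.zeros(dim), q2_hat)

# Fixed contaminant along one direction.
c_fixed = np.zeros(dim)
c_fixed[0] = 1.0
p_fixed = prob_low_power(A, cov, c_fixed, q2_hat)

# Random, non-Gaussian contaminant drawn from a separate generator,
# so that it is statistically independent of x.
rng_c = np.random.default_rng(1)
c_random = rng_c.uniform(-1.5, 1.5, size=(100_000, dim))
p_random = prob_low_power(A, cov, c_random, q2_hat)

print(p_none, p_fixed, p_random)
```

In this toy setup both contaminated probabilities come out below the uncontaminated one, as inequality (9) requires; the sketch only spot-checks particular cases and is no substitute for the proof in the Appendix.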

Figure 2 illustrates this conclusion for the case of an ellipsoidal 
Universe. The figure shows the cumulative probability 
distribution of the quadrupole power $C_2$, based 
on 1000 simulations of the CMB sky. The simulations 
were performed using HEALPix [28] with $N_{\rm side} = 32$. 
The solid curve shows the distribution for a Gaussian 
CMB with the power spectrum given by the best-fit 
LCDM model from the three-year WMAP data [29]. 
From bottom to top, the other three curves show models 
with the same power spectrum but with eccentricities 
$5 \times 10^{-3}$, $6.2 \times 10^{-3}$, $7.4 \times 10^{-3}$. According to the analysis 
of ref. [14], eccentricities in this range provide a better fit 
to the CMB quadrupole than the standard model; however, 
as Gruppuso [25] has pointed out, the calculations 
in ref. [14] do not properly account for all possible relative 
orientations of the ellipticity axis and the intrinsic CMB 
anisotropy and hence overestimate the goodness of fit of 
the ellipsoidal models. The horizontal line indicates the 
value found in the actual WMAP data (specifically, the 
three-year internal linear combination data, downgraded 
to $N_{\rm side} = 32$). The curves in the left panel were computed 
using the entire sky, while those in the right panel were 
computed using the WMAP Kp0 cut [30]. 

The figure illustrates that the probability of getting 
a quadrupole value below any given cutoff strictly decreases 
as the size of the perturbation increases. As 
predicted by inequality (9), the way to get the highest 
probability is to have no perturbation at all. In particular, 
for the no-cut data, the probability of getting a 
value as small as the actual data is $\sim 5\%$ in the standard 
model and drops to $\sim 3\%$, $1.5\%$, $0.2\%$ as the ellipticity 
increases. When the Kp0 mask is applied, the probabilities 
are lower in all cases than in the full-sky case, but 
the same decrease in probability is observed. These conclusions 
are consistent with those of ref. [25], but we have 
established the conclusion for a much broader category 
of theories, not just this specific case. 

III. CORRELATION FUNCTION 

The low quadrupole does not have particularly high 
statistical significance, largely because of the high level of 
cosmic variance in the quadrupole. The two-point angular 
correlation function provides a much more significant 
indication that there is an anomalous lack of large-scale 
power in the WMAP data. In particular, the integrated 
square of the correlation function, 

$$S_{1/2} = \int_{-1}^{1/2} [C(\theta)]^2 \, d\cos\theta, \tag{10}$$

which was first introduced in the analysis of the 1-year 
WMAP data, is extremely low in the WMAP data 
in comparison with theoretical estimates, with a p-value 
of order 0.1%. Here $C(\theta)$ is the two-point correlation 
function, that is, the average of the product of the temperatures 
over all pairs of pixels with angular separation $\theta$. We wish to examine whether 
adding a perturbation to the standard model can solve 
this problem (that is, raise the probability of getting the 
observed low value of $S_{1/2}$). 
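
To make the definition of $S_{1/2}$ concrete, here is a minimal Python sketch that evaluates it for a given power spectrum. It assumes a full-sky correlation function built from the standard Legendre expansion $C(\theta) = \sum_l \frac{2l+1}{4\pi} C_l P_l(\cos\theta)$, which is not how the correlation functions in this paper were obtained (those were computed from the maps with SpICE). The function names and the single-multipole test spectrum are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import legendre

def correlation_function(cl, cos_theta):
    """C(theta) from a power spectrum; cl[l] holds C_l for l = 0..lmax."""
    ell = np.arange(len(cl))
    coeffs = (2 * ell + 1) / (4 * np.pi) * np.asarray(cl, dtype=float)
    return legendre.legval(cos_theta, coeffs)

def s_half(cl, n_nodes=64):
    """S_1/2: integral of C(theta)^2 over cos(theta) from -1 to 1/2."""
    nodes, weights = legendre.leggauss(n_nodes)  # Gauss-Legendre rule on [-1, 1]
    lo, hi = -1.0, 0.5
    # Map the rule from [-1, 1] onto [lo, hi].
    x = 0.5 * (hi - lo) * nodes + 0.5 * (hi + lo)
    w = 0.5 * (hi - lo) * weights
    return float(np.sum(w * correlation_function(cl, x) ** 2))

# Illustrative check: a pure quadrupole normalized so that C(theta) = P_2(cos theta).
cl_quadrupole = np.zeros(3)
cl_quadrupole[2] = 4 * np.pi / 5
s_quad = s_half(cl_quadrupole)
print(s_quad)
```

For this test spectrum the integral can be done by hand, $\int_{-1}^{1/2} P_2(x)^2\,dx = 0.2765625$, which gives a simple check of the quadrature.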

Since the statistic $S_{1/2}$ is quartic, not quadratic, in 
the data, the argument of the previous section does not 
apply to it. However, it is extremely plausible to suppose 
that a similar conclusion should hold, since any model 
with a high probability of producing low values of this 
statistic would presumably produce low values of the low-order 
multipoles, and since any contaminant reduces the 
probability of such low multipoles. 

FIG. 4: Cumulative probability distributions for the statistic $S_{1/2}$ for Bianchi VII$_h$ (rotating Universe) models. The values of 
the shear are 0 (solid), $2.4 \times 10^{-10}$ (dotted), $5 \times 10^{-10}$ (dashed), and $1 \times 10^{-9}$ (dot-dashed). As in the previous figures, no-cut 
probabilities are shown on the left, and Kp0-cut probabilities are shown on the right. 

We can of course test this conjecture numerically for 
any particular model. For example, Figure 3 shows the 
results of simulations precisely like those shown in Figure 2 
but with the statistic $S_{1/2}$ used in place of the 
quadrupole. The SpICE software [24] was used to compute 
the correlation functions. Figure 4 shows the results 
of similar calculations, for the case of a model in which 
the spacetime geometry is that of a rotating Bianchi VII$_h$ 
model [17]. We have also performed computations for 
models in which the contaminant consists of circular hot 
and cold spots of varying amplitudes and radii, to simulate 
the effects of local voids or similar features. In all of 
these cases, the addition of a contaminant does not solve 
the problem of the lack of large-scale power; in fact, it 
worsens it. 

Rather than examining theories one at a time, it would 
clearly be better to have a general argument that applied 
to a broad class of theories. In the rest of this section, 
we provide such an argument. 

Suppose that the value of $S_{1/2}$ for the actual data is 
$\hat{S}_{1/2}$. Let $V$ be the volume in the data space that yields 
values of the statistic this low: 

$$V = \{\mathbf{y} \mid S_{1/2}(\mathbf{y}) < \hat{S}_{1/2}\}. \tag{11}$$

Then, assuming a contaminant given by a fixed vector $\mathbf{c}$, 
the probability of getting such a low value of the statistic 
is 

$$P(\mathbf{c}) = \int_{(\mathbf{x}+\mathbf{c}) \in V} f_x(\mathbf{x})\, d\mathbf{x} = \int_{\mathbf{y} \in V} f_x(\mathbf{y} - \mathbf{c})\, d\mathbf{y}. \tag{12}$$

We want to know whether there are any vectors $\mathbf{c}$ such 
that $P(\mathbf{c}) > P(\mathbf{0})$, or in other words whether $P$ fails to have a 
global maximum at $\mathbf{c} = 0$. It is straightforward to check 
that $\nabla P(\mathbf{0}) = 0$. We next consider whether the point 
$\mathbf{c} = 0$ is a maximum, a minimum, or a saddle point. If 
we find that it is a maximum, then the addition of any 
small contaminant worsens the problem we are trying to 
solve. 



To answer this question, we naturally consider the matrix 
of second derivatives: 

$$H_{jk} = -\left.\frac{\partial^2 P}{\partial c_j\, \partial c_k}\right|_{\mathbf{c}=0}. \tag{13}$$

Then $P$ has a local maximum at the origin if and only 
if $\mathbf{H}$ is positive definite. Moreover, if $\mathbf{H}$ is not positive 
definite, then the eigenvectors corresponding to negative 
eigenvalues yield the directions in data space (i.e., particular 
forms for the contaminant $\mathbf{c}$) that alleviate the 
problem of low $S_{1/2}$. 

To calculate these derivatives, it is convenient to transform 
the data to a basis that diagonalizes the covariance 
matrix in the Gaussian probability density $f_x$. The most 
natural way to accomplish this is to work in the spherical 
harmonic basis, in which case each data point is a 
coefficient $a_{lm}$. We can normalize each data point according 
to the power spectrum, setting $x_j = a_{lm}/C_l^{1/2}$, 
where the index $j$ runs over all pairs $lm$. In this case the 
covariance matrix is simply the identity matrix, and the 
second derivative matrix elements can be written 

$$H_{jk} = -\int_V d\mathbf{x}\, f_x(\mathbf{x})\, (x_j x_k - \delta_{jk}). \tag{14}$$
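
The step from the probability $P(\mathbf{c})$ to the integral above can be spelled out as follows. This is a reconstruction of the intermediate algebra rather than text from the original derivation; it assumes the normalized coordinates just introduced, so that $f_x(\mathbf{x}) \propto \exp(-|\mathbf{x}|^2/2)$, and takes $\mathbf{H}$ to be minus the second-derivative matrix of $P$ at $\mathbf{c} = 0$, as the overall minus sign in the integral implies. The last line uses the fact that $S_{1/2}$ is unchanged under $\mathbf{y} \to -\mathbf{y}$, so that $V$ is symmetric about the origin.

```latex
% Whitened coordinates: f_x(\mathbf{x}) \propto \exp(-|\mathbf{x}|^2/2),
% and P(\mathbf{c}) = \int_V d\mathbf{y}\, f_x(\mathbf{y}-\mathbf{c}).
\begin{align*}
\frac{\partial}{\partial c_j} f_x(\mathbf{y}-\mathbf{c})
  &= (y_j - c_j)\, f_x(\mathbf{y}-\mathbf{c}), \\
\frac{\partial^2}{\partial c_j\,\partial c_k} f_x(\mathbf{y}-\mathbf{c})
  &= \left[(y_j - c_j)(y_k - c_k) - \delta_{jk}\right] f_x(\mathbf{y}-\mathbf{c}), \\
\left.\frac{\partial^2 P}{\partial c_j\,\partial c_k}\right|_{\mathbf{c}=0}
  &= \int_V d\mathbf{y}\, f_x(\mathbf{y})\,(y_j y_k - \delta_{jk}) = -H_{jk}, \\
\left.\frac{\partial P}{\partial c_j}\right|_{\mathbf{c}=0}
  &= \int_V d\mathbf{y}\, y_j\, f_x(\mathbf{y}) = 0
  \quad \text{(odd integrand over a symmetric region)}.
\end{align*}
```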



This integral over the many-dimensional data space 
can most easily be estimated by Monte Carlo integration. 
To be specific, we draw vectors $\mathbf{x}$ from the appropriate 
Gaussian distribution, calculate the corresponding 
values of $S_{1/2}$, and use the results to throw away all vectors 
that lie outside of $V$. For all the rest, we average 
together the quantities $(x_j x_k - \delta_{jk})$. 
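
The Monte Carlo procedure just described can be sketched in a few lines of Python. This is a toy version under stated assumptions, not the pipeline used here: the data are taken to be already whitened (unit covariance), the statistic is a simple sum of squares standing in for $S_{1/2}$, and the function name `hessian_estimate`, the dimension, and the threshold are all hypothetical.

```python
import numpy as np

def hessian_estimate(statistic, threshold, dim, n_samples=200_000, seed=0):
    """Monte Carlo estimate of H_jk = -int_V dx f_x(x) (x_j x_k - delta_jk).

    `statistic` maps an array of shape (n, dim) to n values; the region V
    is the set of vectors with statistic below `threshold`.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, dim))   # whitened Gaussian draws
    inside = statistic(x) < threshold           # keep only vectors inside V
    kept = x[inside]
    frac = inside.mean()                        # estimate of P(0)
    mean_outer = kept.T @ kept / len(kept)      # average of x_j x_k inside V
    # Averaging over accepted vectors and multiplying by the accepted
    # fraction reproduces the normalization of the integral over V.
    H = -frac * (mean_outer - np.eye(dim))
    return H, frac

def toy_statistic(x):
    # Stand-in quadratic statistic (an assumption, not S_1/2).
    return np.sum(x**2, axis=1)

H, frac = hessian_estimate(toy_statistic, threshold=1.0, dim=3)
eigenvalues = np.linalg.eigvalsh(H)
print(frac, eigenvalues)
```

For the quadratic stand-in statistic all eigenvalues come out positive, consistent with the result of Section II. The overall factor `frac` does not affect the signs of the eigenvalues, so averaging over the accepted vectors alone would serve equally well for deciding whether the matrix is positive definite.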

FIG. 5: Eigenvalues of the second derivative matrix $\mathbf{H}$. Two 
independent calculations of the matrix were performed, each 
based on 80 000 simulations using the Kp0 cut. The matrices 
were truncated to include only multipoles $l = 2$ through 8. 
The solid curve shows the eigenvalues computed from one 
matrix, sorted from largest to smallest. The dashed curve 
shows the quantities $\mathbf{v} \cdot \mathbf{H} \cdot \mathbf{v}$, where $\mathbf{v}$ are the eigenvectors 
computed from the first matrix and $\mathbf{H}$ is the second matrix. 
The difference between the two curves gives an indication of 
the numerical error in the Monte Carlo integration. 

In performing this Monte Carlo integration, we consider 
HEALPix maps with $N_{\rm side} = 32$ and the same power 
spectrum as in the previous section. We apply Gaussian 
smoothing with a 20° FWHM beam to the simulated 
maps. This amount of smoothing results in significant 
suppression (by more than $e^{-1}$) of spherical harmonic 
coefficients with $l > 10$. Without significant smoothing, fluctuations 
in high-$l$ modes cause significant error in the 
Monte Carlo calculation even at low $l$. The problem of 
anomalously low $S_{1/2}$ persists at about the same significance 
(p-values $\sim 0.1\%$) even with such smoothing, so 
this smoothing does not weaken our ability to draw conclusions 
about possible explanations for the anomaly. 
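
The quoted level of suppression can be checked with a short Python sketch. It assumes the standard Gaussian beam window function $b_l = \exp[-l(l+1)\sigma^2/2]$ with $\sigma = \mathrm{FWHM}/\sqrt{8 \ln 2}$ applied to the harmonic coefficients; whether the simulations used exactly this convention is not stated here, and the function name is hypothetical.

```python
import numpy as np

def gaussian_beam_window(ell, fwhm_deg):
    """Gaussian beam window b_l multiplying the a_lm (assumed convention)."""
    sigma = np.radians(fwhm_deg) / np.sqrt(8.0 * np.log(2.0))
    ell = np.asarray(ell, dtype=float)
    return np.exp(-0.5 * ell * (ell + 1.0) * sigma**2)

ells = np.arange(2, 16)
b_ell = gaussian_beam_window(ells, fwhm_deg=20.0)

# First multipole whose coefficient is suppressed by more than a factor 1/e.
first_suppressed = int(ells[b_ell < np.exp(-1.0)][0])
print(first_suppressed)
```

Under this convention the coefficients fall below $e^{-1}$ of their unsmoothed values from $l = 10$ onward, in line with the statement above.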

Figure 5 shows the eigenvalues of the matrix resulting 
from this Monte Carlo integration, using the Kp0 mask. 
The results look similar when data from the full sky are 
used. The matrix used to compute the eigenvalues was 
based on 80 000 simulations lying within the volume $V$. 
Modes up to $l = 8$ were used to compute the eigenvalues 
shown in the figure, although modes up to $l = 64$ (far 
above the beam scale) were used in the simulations. To 
test the numerical stability of the results, we used a second 
set of 80 000 simulations to recompute the matrix $\mathbf{H}$. 
We then calculated $\mathbf{v} \cdot \mathbf{H} \cdot \mathbf{v}$ for each eigenvector $\mathbf{v}$ 
of the first matrix. The 
results are shown in the dashed curve. In the absence of 
numerical error, the two curves would be identical. 
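
The stability test described above can be sketched as follows. This is an illustrative Python version, assuming two independently estimated symmetric matrices are already available; the function name `cross_check_eigenvalues` and the small random test matrices are hypothetical stand-ins for the Monte Carlo estimates of $\mathbf{H}$.

```python
import numpy as np

def cross_check_eigenvalues(h_first, h_second):
    """Eigenvalues of h_first and the quadratic forms v . h_second . v,
    where v runs over the eigenvectors of h_first (largest eigenvalue first)."""
    eigvals, eigvecs = np.linalg.eigh(h_first)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    # Column j of eigvecs is the j-th eigenvector.
    quad_forms = np.einsum("ij,ik,kj->j", eigvecs, h_second, eigvecs)
    return eigvals, quad_forms

# Stand-in data: one "true" symmetric matrix plus two independent noise terms.
rng = np.random.default_rng(0)
base = rng.standard_normal((6, 6))
h_true = 0.5 * (base + base.T)

def add_noise(h, seed, scale=0.005):
    n = np.random.default_rng(seed).standard_normal(h.shape)
    return h + scale * (n + n.T)

h_a = add_noise(h_true, seed=1)
h_b = add_noise(h_true, seed=2)
eigs_a, forms_b = cross_check_eigenvalues(h_a, h_b)
print(np.max(np.abs(eigs_a - forms_b)))
```

The spread between the two returned arrays plays the role of the gap between the solid and dashed curves: it indicates how much of the structure in the eigenvalue spectrum, including any negative eigenvalues, could be due to Monte Carlo noise.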

Although there is some numerical error due to the 
Monte Carlo integration, it appears that the matrix is 
not positive definite. We wish to examine the eigenvectors 
corresponding to the most negative eigenvalues, 
since these describe particular contaminants that might 
solve the problem of a lack of large-scale power. Figure 
6 shows the particular pattern on the sky corresponding 
to the most negative eigenvalue. Most of the power in 
this contaminant is found in multipole $l = 5$, as is the 
power in all of the most negative eigenvectors. To test 
the robustness of this pattern, we computed the eigenvectors 
retaining varying numbers of modes in the matrix 
$\mathbf{H}$, ranging from $l_{\max} = 5$ to 15, and also using varying 
subsets of the Monte Carlo simulations to compute the matrix. The 
results are quite consistent, with the most negative eigenvectors 
always having most of their power at $l = 5$ and 
looking quite similar to Fig. 6. 

FIG. 6: The sky pattern corresponding to the most negative 
eigenvalue of $\mathbf{H}$. 

The existence of these negative eigenvalues seems to 
contradict our assertion that no contaminant can explain 
the low value of $S_{1/2}$: modes such as the one shown in 
Fig. 6, by construction, raise the probability of getting 
a low value when added to the data. However, when we 
assess the amount of improvement that these modes can 
provide, we find it to be negligible. Consider a model 
in which we add a contaminant of the form shown in 
Fig. 6 with some amplitude $a$ to the standard model. The 
results of this section have shown that the probability of 
getting a low $S_{1/2}$ is an increasing function of $a$ at low 
$a$. However, because the eigenvalue is fairly small, the 
increase might be expected to be slight. Furthermore, for 
sufficiently large values of $a$, the probability must start to 
decrease again. 

Fig. 7 shows that this is indeed the case, and furthermore 
that no choice of $a$ leads to a significant increase 
in the probability of getting a value as low as the actual 
data. This probability remains virtually unchanged at 
$\sim 10^{-3}$ for small $a$ and then decreases dramatically for 
larger $a$. Since all of the eigenvectors corresponding to 
significantly negative eigenvalues of $\mathbf{H}$ give patterns quite 
similar to this one, we can conclude with confidence that 
no such pattern can significantly alleviate the problem of 
low $S_{1/2}$. 
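
An amplitude scan of this kind is simple to set up. The Python sketch below shows the structure of such a test under loudly stated assumptions: the data are whitened, the pattern is an arbitrary unit vector, and the statistic is a quadratic stand-in rather than $S_{1/2}$, so the probability here falls monotonically with amplitude instead of staying flat at small $a$. The function name `prob_low_vs_amplitude` is hypothetical.

```python
import numpy as np

def prob_low_vs_amplitude(statistic, pattern, amplitudes, threshold,
                          n_samples=100_000, seed=0):
    """Probability of a low statistic when a fixed pattern is added with
    each amplitude; the same Gaussian draws are reused for every amplitude."""
    rng = np.random.default_rng(seed)
    pattern = np.asarray(pattern, dtype=float)
    x = rng.standard_normal((n_samples, len(pattern)))  # whitened data
    probs = []
    for a in amplitudes:
        y = x + a * pattern                 # data plus fixed contaminant
        probs.append(float(np.mean(statistic(y) < threshold)))
    return np.array(probs)

def toy_statistic(y):
    # Quadratic stand-in (an assumption, not S_1/2).
    return np.sum(y**2, axis=1)

pattern = np.array([1.0, 0.0, 0.0, 0.0])
amps = [0.0, 0.5, 1.0, 2.0]
probs = prob_low_vs_amplitude(toy_statistic, pattern, amps, threshold=1.5)
print(probs)
```

Replacing `toy_statistic` with a function that computes $S_{1/2}$ from a simulated map, and `pattern` with the eigenvector of interest, would give the kind of scan summarized in Fig. 7; reusing the same random draws for every amplitude reduces the noise in the comparison between curves.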

IV. CONCLUSIONS 

We have considered a broad class of cosmological models, 
obtained by adding a contaminant to the standard 
best-fit inflation-based model. The only assumption we 
have made about the contaminant is that it is statistically 
independent of the cosmological signal. We have argued 
that all such models exacerbate rather than alleviate 
the lack of large-scale power in the WMAP data. We 
have proven this result to be true when the lack of power 
is quantified by the quadrupole moment and have presented 
strong numerical evidence in support of it when 
the two-point correlation function is used. Since the latter 
in particular is discrepant at a highly significant level 
already, any theory that worsens this discrepancy should 
be regarded with great skepticism. 

FIG. 7: Cumulative probability distributions for models in 
which a fixed contaminant of the form shown in Fig. 6 is 
added to the sky. This pattern corresponds to the most negative 
eigenvalue of the matrix $\mathbf{H}$ and so might be expected to 
increase the probability of finding a low value of $S_{1/2}$. The 
curves shown in the figure are for varying amplitudes of the 
contaminant, with root-mean-square pixel values of 0 (solid), 
2 μK (dotted), 4 μK (dashed), 8 μK (dot-dashed). No value 
of the amplitude causes the probability of getting values as 
low as those in the real data to increase noticeably. 

In addition to exotic cosmologies such as models with 
a global ellipsoidal anisotropy, the class of models consid- 
ered herein includes more mundane possibilities such as 
undiagnosed foregrounds and many systematic errors. In 
particular, since several of the observed anomalies seem 
to "pick out" the ecliptic plane as a preferred direction, 
some attention has focused on a local foreground as a pos- 
sible explanation. The calculations presented here argue 
against such models. 

In any particular model with a contaminant, of course, 
it is possible that a chance cancellation between the con- 
taminant and the intrinsic CMB anisotropy can occur, 
leading to the observed lack of large-scale power. What 
we have shown is that such a chance cancellation is al- 
ways unlikely, and in particular that it is always more un- 
likely than the lack of large-scale power occurring based 
on cosmic variance alone, without a contaminant. 

The question of how seriously to take the various large- 
angle CMB anomalies, including the lack of large-scale 
power as well as the various other puzzles, has been much 
debated [31]. In particular, because they are all based 
on a posteriori statistics (i.e., on statistical significances 
calculated after the anomalies had already been noted), 
the quoted significances cannot be taken at face value. 
Arguably, however, the large-scale power deficit suffers 
less from this problem of a posteriori statistics. After 
all, for virtually the entire existence of the field of CMB 
anisotropy studies, the two-point correlation function has 
been regarded as one of the most natural statistics to use 
in quantifying the level of structure in CMB maps as a 
function of angular scale. For instance, upper limits on 
CMB anisotropy in the pre-COBE era were usually pre- 
sented as limits on the correlation function. Although 
the particular statistic $S_{1/2}$ is an a posteriori invention, 
it merely quantifies the mean-square level of this func- 
tion, which was already regarded a priori as a natural 
function to compute. Although one can certainly dis- 
pute the extent to which the significance of the lack of 
large-angle correlations is an artifact of the particular 
choice of statistic (for instance, see [Hj], who do not use 
the $S_{1/2}$ statistic and find less significant discrepancies), 
nonetheless we believe that, of all the observed anoma- 
lies, the large-scale power deficit is one of the most in 
need of explanation. 

Anomalies in a data set naturally prompt thoughts 
of systematic errors or contaminants in the data. Per- 
haps counterintuitively, this particular anomaly provides 
a strong argument against such possibilities. In particu- 
lar, a foreground that was not removed from the data 
(due to having a spectrum indistinguishable from the 
CMB, for example) would fall precisely into the category 
considered herein. Note, however, that if the foreground 
removal procedure itself removes part of the cosmological 
signal, the resulting error would not fall into the cate- 
gory considered herein. In particular, the ILC method 
does project out some of the intrinsic CMB signal and so 
in principle does reduce the amount of large-scale power. 
This effect is calculable and has been found to be negli- 
gible, however. 

There are of course a wide variety of possible expla- 
nations for the anomalies that do not fall into the cate- 
gory considered here. For example, simply modifying the 
primordial power spectrum at large scales naturally alleviates 
the problem of a lack of large-scale CMB power 
(e.g., [33, 34]). Some models with nontrivial topology 
(e.g., [35, 36, 37, 38, 39]) also have this effect, although 
such models have other problems [40]. The framework of 
spontaneous isotropy breaking [41] also provides a class 
of models that are not based on simply adding a pertur- 
bation to the standard cosmology. Models such as these 
(and many others) may provide an explanation for the 
puzzles in the large-angle CMB. 
APPENDIX: PROOF OF EQUATION (7) 

We first express equation (6) in terms of the integration 
variable $\mathbf{y} = \mathbf{x} + \mathbf{c}$: 

$$P_{\mathbf{c}} = \int_V d\mathbf{y}\, f_x(\mathbf{y} - \mathbf{c}). \tag{15}$$



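We next apply a linear coordinate transformation that 
maps the ellipsoid $V$ onto the unit sphere. To be specific, 
we find a matrix $\mathbf{L}$ such that $\mathbf{A} = \mathbf{L}\mathbf{L}^T$ (e.g., by Cholesky 
decomposition). We define $\mathbf{y}' = (\mathbf{L}^T \cdot \mathbf{y})/\sqrt{\hat{q}^2 + b}$ and $\mathbf{c}'$ 
similarly. Then 

$$P_{\mathbf{c}} \propto \int_{|\mathbf{y}'|^2 < 1} d\mathbf{y}'\, f_{x'}(\mathbf{y}' - \mathbf{c}'). \tag{16}$$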



Here $f_{x'}$ is a multivariate Gaussian probability density 
with a new inverse covariance matrix $\mathbf{M}'$, and the proportionality 
constant is determined by the Jacobian of 
the coordinate transformation. For convenience, we now 
make yet another coordinate transformation: we apply a 
rotation that diagonalizes $\mathbf{M}'$. The result is 

$$P_{\mathbf{c}} \propto \int_{|\mathbf{y}''| < 1} d\mathbf{y}'' \exp\left(-\sum_i \frac{(y_i'' - c_i'')^2}{2\sigma_i^2}\right), \tag{17}$$

where $1/\sigma_i^2$ are the eigenvalues of $\mathbf{M}'$. 
For the remainder of this section we drop the double 
primes. 



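We now show that $P_{\mathbf{c}}$ has a maximum at $\mathbf{c} = 0$. Differentiate 
the above expression for $P_{\mathbf{c}}$ with respect to $c_1$: 

$$\frac{\partial P_{\mathbf{c}}}{\partial c_1} \propto \int_{|\mathbf{y}| < 1} d\mathbf{y}\, \exp\left(-\sum_i \frac{(y_i - c_i)^2}{2\sigma_i^2}\right) \frac{y_1 - c_1}{\sigma_1^2} \tag{18}$$

$$= \int dy_2 \cdots dy_n \exp\left(-\sum_{i=2}^n \frac{(y_i - c_i)^2}{2\sigma_i^2}\right) \int_{-Y_1}^{Y_1} \exp\left(-\frac{(y_1 - c_1)^2}{2\sigma_1^2}\right) \frac{y_1 - c_1}{\sigma_1^2}\, dy_1, \tag{19}$$

where $Y_1 = \left(1 - \sum_{i=2}^n y_i^2\right)^{1/2}$. Performing the $y_1$ integral 
yields 

$$\frac{\partial P_{\mathbf{c}}}{\partial c_1} \propto \int dy_2 \cdots dy_n \exp\left(-\sum_{i=2}^n \frac{(y_i - c_i)^2}{2\sigma_i^2}\right) \left[e^{-(Y_1 + c_1)^2/2\sigma_1^2} - e^{-(Y_1 - c_1)^2/2\sigma_1^2}\right]. \tag{20}$$

The integrand (and hence the integral) is strictly positive 
for $c_1 < 0$ and negative for $c_1 > 0$. That is, for any 
fixed values of $c_2, \ldots, c_n$, the function $P_{\mathbf{c}}$ has its only 
maximum at $c_1 = 0$. The same argument applies to each 
of the other $c_j$. Hence $P_{\mathbf{c}}$ has a global maximum at $\mathbf{c} = 0$. 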
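
The one-dimensional case gives a quick sanity check of this argument, since there the probability has a closed form in terms of the normal distribution function. The Python sketch below is an illustration only; the function name and the chosen shifts are assumptions.

```python
import math

def prob_inside_unit_interval(c, sigma=1.0):
    """Pr[|y| < 1] for y = x + c with x drawn from N(0, sigma^2)."""
    def phi(z):
        # Standard normal cumulative distribution function.
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return phi((1.0 - c) / sigma) - phi((-1.0 - c) / sigma)

shifts = [0.0, 0.5, 1.0, 2.0]
probs = [prob_inside_unit_interval(c) for c in shifts]
print(probs)
```

The probability is largest for zero shift and decreases steadily as the shift grows in either direction, which is the one-dimensional version of the statement that $P_{\mathbf{c}}$ is maximized at $\mathbf{c} = 0$.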